Name | Version | Summary | Date |
semantic-text-splitter | 0.13.0 | Split text into semantic chunks, up to a desired chunk size. Supports calculating length by characters and tokens, and is callable from Rust and Python. | 2024-05-05 23:06:54 |
taibun | 1.1.1 | Taiwanese Hokkien Transliterator and Tokeniser | 2024-05-01 20:31:54 |
example990420 | 1.1.1 | Taiwanese Hokkien Transliterator and Tokeniser | 2024-05-01 20:28:38 |
gpt3_tokenizer | 0.1.5 | Encoder/Decoder and tokens counter for GPT3 | 2024-04-26 18:07:33 |
bleuscore | 0.1.1 | A fast(not yet :) bleu score calculator | 2024-04-26 07:54:55 |
pyregtokenizer | 0.0.2 | A BPE Tokenizer using regex | 2024-04-21 07:30:19 |
tokenizers | 0.19.1 | None | 2024-04-17 21:40:41 |
ipatok | 0.4.2 | IPA tokeniser | 2024-04-07 13:51:48 |
UniTok | 3.5.2 | Unified Tokenizer | 2024-03-25 06:28:05 |